As this is the very first exercise in this workshop, it is relatively easy and short. Its purpose is to get used to this exercise format and, more importantly, to install all necessary packages for this course.
You can find the solutions for this exercise and all other exercises in the ./solutions folder in the repo/directory that contains the course materials. You can copy code from these exercise files by clicking on the small blue clipboard icon in the code boxes’ upper right corner.
Note: We recommend using the ./MY_CODE folder to store your R script files for this course. If you do, make sure that your script uses the root directory of the course materials as its working directory. You can check the current working directory with getwd() and change it with setwd().
An arguably more elegant solution is to open tidyverse-workshop-esra-2021.Rproj in RStudio, which opens the course materials as a project. In this case, the working directory is already set.
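If you are not using the RStudio project, checking and setting the working directory could look roughly like the sketch below. The path stored in course_root is a placeholder we made up for illustration; replace it with the folder where you saved the course materials.

```r
# Show the directory R currently treats as the working directory
getwd()

# Hypothetical location of the course materials; replace with your own path
course_root <- "path/to/course-materials"

# Only switch if the folder actually exists, so the script does not stop with an error
if (dir.exists(course_root)) {
  setwd(course_root)
} else {
  message("Folder not found, working directory left unchanged: ", getwd())
}

# Should be TRUE once the working directory is the root of the course materials
dir.exists("MY_CODE")
```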
And here comes our very first exercise:
Install and load the tidyverse package.
One option is the easypackages package, which can be installed with the command install.packages("easypackages"). After loading the package with library(easypackages), you can load and install packages with the command easypackages::packages("fancy_package_1", "fancy_package_2", ...).
easypackages::packages(
  "tidyverse"
)
## All packages loaded successfully
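If you prefer not to depend on an extra helper package, the same check can be sketched in base R. The function name missing_packages() is our own invention for this illustration, and the install step is commented out so that nothing is downloaded by accident.

```r
# Return the names of those packages in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

to_install <- missing_packages(c("tidyverse"))

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All packages are already installed")
}
```

Once everything is installed, library(tidyverse) loads the package as usual.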
And here’s another quick exercise.
Load R’s built-in dataset USArrests and print it.
You can type USArrests directly in the console or use the print() function.
data("USArrests")
USArrests
## Murder Assault UrbanPop Rape
## Alabama 13.2 236 58 21.2
## Alaska 10.0 263 48 44.5
## Arizona 8.1 294 80 31.0
## Arkansas 8.8 190 50 19.5
## California 9.0 276 91 40.6
## Colorado 7.9 204 78 38.7
## Connecticut 3.3 110 77 11.1
## Delaware 5.9 238 72 15.8
## Florida 15.4 335 80 31.9
## Georgia 17.4 211 60 25.8
## Hawaii 5.3 46 83 20.2
## Idaho 2.6 120 54 14.2
## Illinois 10.4 249 83 24.0
## Indiana 7.2 113 65 21.0
## Iowa 2.2 56 57 11.3
## Kansas 6.0 115 66 18.0
## Kentucky 9.7 109 52 16.3
## Louisiana 15.4 249 66 22.2
## Maine 2.1 83 51 7.8
## Maryland 11.3 300 67 27.8
## Massachusetts 4.4 149 85 16.3
## Michigan 12.1 255 74 35.1
## Minnesota 2.7 72 66 14.9
## Mississippi 16.1 259 44 17.1
## Missouri 9.0 178 70 28.2
## Montana 6.0 109 53 16.4
## Nebraska 4.3 102 62 16.5
## Nevada 12.2 252 81 46.0
## New Hampshire 2.1 57 56 9.5
## New Jersey 7.4 159 89 18.8
## New Mexico 11.4 285 70 32.1
## New York 11.1 254 86 26.1
## North Carolina 13.0 337 45 16.1
## North Dakota 0.8 45 44 7.3
## Ohio 7.3 120 75 21.4
## Oklahoma 6.6 151 68 20.0
## Oregon 4.9 159 67 29.3
## Pennsylvania 6.3 106 72 14.9
## Rhode Island 3.4 174 87 8.3
## South Carolina 14.4 279 48 22.5
## South Dakota 3.8 86 45 12.8
## Tennessee 13.2 188 59 26.9
## Texas 12.7 201 80 25.5
## Utah 3.2 120 80 22.9
## Vermont 2.2 48 32 11.2
## Virginia 8.5 156 63 20.7
## Washington 4.0 145 73 26.2
## West Virginia 5.7 81 39 9.3
## Wisconsin 2.6 53 66 10.8
## Wyoming 6.8 161 60 15.6
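Printing a plain data frame dumps every row into the console. A few base R helpers, sketched below, give a shorter first look; note that the state names are stored as row names, not as a regular column.

```r
data("USArrests")

# Number of rows and columns
dim(USArrests)

# Only the first three rows instead of all 50
head(USArrests, 3)

# The state names live in the row names of the data frame
head(rownames(USArrests), 3)
```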
Convert the data to a tibble and print it. Compare it to the previous output.
You can use the tibble::as_tibble() function.
USArrests_tibble <-
  tibble::as_tibble(USArrests)
USArrests_tibble
## # A tibble: 50 x 4
## Murder Assault UrbanPop Rape
## <dbl> <int> <int> <dbl>
## 1 13.2 236 58 21.2
## 2 10 263 48 44.5
## 3 8.1 294 80 31
## 4 8.8 190 50 19.5
## 5 9 276 91 40.6
## 6 7.9 204 78 38.7
## 7 3.3 110 77 11.1
## 8 5.9 238 72 15.8
## 9 15.4 335 80 31.9
## 10 17.4 211 60 25.8
## # ... with 40 more rows
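One difference worth noticing in the comparison is that the state names are gone, because tibbles do not keep row names. If you need them, a possible approach is the rownames argument of as_tibble(), which turns the row names into a regular column. The column name "State" and the object name USArrests_with_states are our own choices for this sketch.

```r
library(tibble)

data("USArrests")

# Store the row names (the states) in a new first column called "State"
USArrests_with_states <- as_tibble(USArrests, rownames = "State")

USArrests_with_states
```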
Print the data using dplyr::glimpse(). What do you think happens to the output?
dplyr::glimpse(USArrests_tibble)
## Rows: 50
## Columns: 4
## $ Murder <dbl> 13.2, 10.0, 8.1, 8.8, 9.0, 7.9, 3.3, 5.9, 15.4, 17.4, 5.3, 2.6, 10.4, 7.2, 2.2, 6.0, 9.7, 15.4, 2.1, 11.3, 4.4, 12.1, 2.7, 16.1~
## $ Assault <int> 236, 263, 294, 190, 276, 204, 110, 238, 335, 211, 46, 120, 249, 113, 56, 115, 109, 249, 83, 300, 149, 255, 72, 259, 178, 109, 1~
## $ UrbanPop <int> 58, 48, 80, 50, 91, 78, 77, 72, 80, 60, 83, 54, 83, 65, 57, 66, 52, 66, 51, 67, 85, 74, 66, 44, 70, 53, 62, 81, 56, 89, 70, 86,~
## $ Rape <dbl> 21.2, 44.5, 31.0, 19.5, 40.6, 38.7, 11.1, 15.8, 31.9, 25.8, 20.2, 14.2, 24.0, 21.0, 11.3, 18.0, 16.3, 22.2, 7.8, 27.8, 16.3, 35~
# dplyr::glimpse() provides another way of displaying the data. For such small
# datasets, it doesn't make a huge difference. But as tibble output is
# truncated for large datasets, glimpse() is a convenient way of getting a quick
# glimpse (haha) of the data.
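For comparison, base R's str() gives a similarly compact, column-wise overview and works without loading any package. This is only a side note, not part of the exercise solution.

```r
data("USArrests")

# One line per column: name, type, and the first few values
str(USArrests)
```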
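Combine loading the data, converting it to a tibble, and dplyr::glimpse() in one %>%-workflow.
The pipe passes its left-hand side to the function on its right, so x %>% f(.) is equivalent to f(x).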
data("USArrests")
USArrests %>%
  tibble::as_tibble() %>%
  dplyr::glimpse()
## Rows: 50
## Columns: 4
## $ Murder <dbl> 13.2, 10.0, 8.1, 8.8, 9.0, 7.9, 3.3, 5.9, 15.4, 17.4, 5.3, 2.6, 10.4, 7.2, 2.2, 6.0, 9.7, 15.4, 2.1, 11.3, 4.4, 12.1, 2.7, 16.1~
## $ Assault <int> 236, 263, 294, 190, 276, 204, 110, 238, 335, 211, 46, 120, 249, 113, 56, 115, 109, 249, 83, 300, 149, 255, 72, 259, 178, 109, 1~
## $ UrbanPop <int> 58, 48, 80, 50, 91, 78, 77, 72, 80, 60, 83, 54, 83, 65, 57, 66, 52, 66, 51, 67, 85, 74, 66, 44, 70, 53, 62, 81, 56, 89, 70, 86,~
## $ Rape <dbl> 21.2, 44.5, 31.0, 19.5, 40.6, 38.7, 11.1, 15.8, 31.9, 25.8, 20.2, 14.2, 24.0, 21.0, 11.3, 18.0, 16.3, 22.2, 7.8, 27.8, 16.3, 35~
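To convince yourself that the pipe is only a different way of writing a function call, you can compare a piped and a nested version, as in the small sketch below. The object names piped, nested, and first_rows are our own; both comparisons should return TRUE.

```r
library(magrittr)  # provides %>%; the tidyverse makes it available as well

data("USArrests")

# The pipe inserts the left-hand side as the first argument of the function
piped  <- USArrests %>% tibble::as_tibble()
nested <- tibble::as_tibble(USArrests)
identical(piped, nested)

# The dot marks explicitly where the left-hand side goes
first_rows <- USArrests %>% head(., 3)
identical(first_rows, head(USArrests, 3))
```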